103 research outputs found

    Using graph theory to analyze biological networks

    Understanding complex systems often requires a bottom-up analysis towards a systems biology approach. The need emerges to investigate a system not only as individual components but as a whole. This can be done by examining the elementary constituents individually and then how they are connected. The myriad components of a system and their interactions are best characterized as networks, which are mainly represented as graphs in which thousands of nodes are connected by thousands of edges. In this article we demonstrate approaches, models and methods from graph theory and discuss ways in which they can be used to reveal hidden properties and features of a network. This network profiling, combined with knowledge extraction, will help us to better understand the biological significance of the system.
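
As a rough illustration of the graph representation this abstract describes, the following minimal sketch stores an undirected network as an adjacency map and reads off node degrees, one of the simplest properties used to profile a network. The node names, the toy edge list, and the helper names `build_graph` and `degrees` are hypothetical and are not taken from the article.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def degrees(adjacency):
    """Return the number of neighbours of every node."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


# Hypothetical toy network: four components and four interactions.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
graph = build_graph(toy_edges)
node_degrees = degrees(graph)

# The most connected node is a candidate "hub" of the network.
hub = max(node_degrees, key=node_degrees.get)
```

Real biological networks would be loaded from interaction databases rather than typed in, and a dedicated graph library may be a better fit at that scale.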

    A Normalized Tree Index for identification of correlated clinical parameters in microarray experiments

    Martin C, Tauchen A, Becker A, Nattkemper TW. A Normalized Tree Index for identification of correlated clinical parameters in microarray data. BioData Mining. 2011;4(1):2.
    BACKGROUND: Measurements at the gene level are widely used to gain new insights into complex diseases, e.g. cancer. A promising approach to understanding basic biological mechanisms is to combine gene expression profiles and classical clinical parameters. However, the computation of a correlation coefficient between high-dimensional data and such parameters is not covered by traditional statistical methods. METHODS: We propose a novel index, the Normalized Tree Index (NTI), to compute a correlation coefficient between the clustering result of high-dimensional microarray data and nominal clinical parameters. The NTI detects correlations between hierarchically clustered microarray data and nominal clinical parameters (labels) and gives a measurement of significance in terms of an empirical p-value of the identified correlations. To this end, the microarray data are first clustered by hierarchical agglomerative clustering using standard settings. In a second step, the computed cluster tree is evaluated. For each label, an NTI is computed measuring the correlation between that label and the clustered microarray data. RESULTS: The NTI successfully identifies correlated clinical parameters at different levels of significance when applied to two real-world microarray breast cancer data sets. Some of the identified highly correlated labels confirm the current state of knowledge, whereas others help to identify new risk factors and provide a good basis for formulating new hypotheses. CONCLUSIONS: The NTI is a valuable tool in the domain of biomedical data analysis. It allows the identification of correlations between high-dimensional data and nominal labels, while at the same time a p-value measures the level of significance of the detected correlations.
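
The empirical p-value mentioned in this abstract can be illustrated with a generic label-permutation test. The abstract does not define the NTI itself, so the statistic below is a stand-in (cluster purity over a flat clustering) and not the authors' index. The function names, the toy clusters, and the labels are all hypothetical.

```python
import random
from collections import Counter


def purity(clusters, labels):
    """Fraction of items that carry the majority label of their cluster.

    Stand-in statistic only; this is NOT the Normalized Tree Index.
    Every cluster must be non-empty.
    """
    total = sum(len(cluster) for cluster in clusters)
    majority = sum(
        Counter(labels[i] for i in cluster).most_common(1)[0][1]
        for cluster in clusters
    )
    return majority / total


def empirical_p_value(clusters, labels, n_permutations=1000, seed=0):
    """Estimate how often shuffled labels score at least as well as the real ones."""
    rng = random.Random(seed)
    observed = purity(clusters, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if purity(clusters, shuffled) >= observed:
            hits += 1
    # The add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_permutations + 1)


# Hypothetical toy data: two clusters of sample indices and one nominal label.
clusters = [[0, 1, 2, 3], [4, 5, 6, 7]]
labels = ["pos"] * 4 + ["neg"] * 4
observed, p_value = empirical_p_value(clusters, labels)
```

A small p-value here only says that the label lines up with the clusters more closely than shuffled labels usually do; the published method evaluates the full cluster tree rather than a flat partition.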

    IMG/VR: a database of cultured and uncultured DNA Viruses and retroviruses.

    Viruses represent the most abundant life forms on the planet. Recent experimental and computational improvements have led to a dramatic increase in the number of viral genome sequences identified primarily from metagenomic samples. As a result of the expanding catalog of metagenomic viral sequences, there exists a need for a comprehensive computational platform integrating all these sequences with associated metadata and analytical tools. Here we present IMG/VR (https://img.jgi.doe.gov/vr/), the largest publicly available database of 3908 isolate reference DNA viruses with 264 413 computationally identified viral contigs from >6000 ecologically diverse metagenomic samples. Approximately half of the viral contigs are grouped into genetically distinct quasi-species clusters. Microbial hosts are predicted for 20 000 viral sequences, revealing nine microbial phyla previously unreported to be infected by viruses. Viral sequences can be queried using a variety of associated metadata, including habitat type and geographic location of the samples, or taxonomic classification according to hallmark viral genes. IMG/VR has a user-friendly interface that allows users to interrogate all integrated data and interact by comparing with external sequences, thus serving as an essential resource in the viral genomics community.

    Which clustering algorithm is better for predicting protein complexes?

    Background: Protein-protein interactions (PPI) play a key role in determining the outcome of most cellular processes. The correct identification and characterization of protein interactions, and of the networks they comprise, is critical for understanding the molecular mechanisms within the cell. Large-scale techniques such as pull-down assays and tandem affinity purification are used to detect protein interactions in an organism. Today, relatively new high-throughput methods such as yeast two-hybrid, mass spectrometry, microarrays, and phage display are also used to reveal protein interaction networks. Results: In this paper we evaluated four different clustering algorithms using six different interaction datasets. We parameterized the MCL, Spectral, RNSC and Affinity Propagation algorithms and applied them to six PPI datasets produced experimentally by Yeast 2 Hybrid (Y2H) and Tandem Affinity Purification (TAP) methods. The predicted clusters, so-called protein complexes, were then compared and benchmarked against known complexes stored in published databases. Conclusions: While results may differ upon parameterization, the MCL and RNSC algorithms seem to be more promising and more accurate at predicting PPI complexes. Moreover, they predict more complexes than the other reviewed algorithms in absolute numbers. On the other hand, the spectral clustering algorithm achieves the highest valid prediction rate in our experiments. However, it is nearly always outperformed by both RNSC and MCL in terms of geometrical accuracy, and it generates fewer valid clusters than any other reviewed algorithm. This article demonstrates various metrics to evaluate the accuracy of such predictions as they are presented in the text below. Supplementary material can be found at: http://www.bioacademy.gr/bioinformatics/projects/ppireview.htm
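
Benchmarking predicted clusters against known complexes, as this abstract describes, needs some matching criterion. The sketch below uses an overlap score of the form shared² / (|predicted| × |known|), which is one common choice in the complex-prediction literature. Whether it is the paper's criterion, and the 0.25 threshold, are assumptions; the protein identifiers and function names are hypothetical.

```python
def overlap_score(predicted, known):
    """Overlap between two non-empty protein sets: shared^2 / (|predicted| * |known|)."""
    shared = len(predicted & known)
    return shared * shared / (len(predicted) * len(known))


def valid_predictions(predicted_clusters, known_complexes, threshold=0.25):
    """Keep predicted clusters that match at least one known complex.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return [
        cluster
        for cluster in predicted_clusters
        if any(overlap_score(cluster, known) >= threshold for known in known_complexes)
    ]


# Hypothetical toy data.
predicted_clusters = [{"p1", "p2", "p3"}, {"p7", "p8"}]
known_complexes = [{"p1", "p2", "p3", "p4"}, {"p5", "p6"}]

matched = valid_predictions(predicted_clusters, known_complexes)
valid_rate = len(matched) / len(predicted_clusters)
```

In this toy case the first cluster shares three proteins with a four-protein complex (score 0.75) and the second shares none, so half of the predictions count as valid.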

    Markov Chain Ontology Analysis (MCOA)

    Background: Biomedical ontologies have become an increasingly critical lens through which researchers analyze the genomic, clinical and bibliographic data that fuels scientific research. Of particular relevance are methods, such as enrichment analysis, that quantify the importance of ontology classes relative to a collection of domain data. Current analytical techniques, however, remain limited in their ability to handle many important types of structural complexity encountered in real biological systems, including class overlaps, continuously valued data, inter-instance relationships, non-hierarchical relationships between classes, semantic distance and sparse data. Results: In this paper, we describe a methodology called Markov Chain Ontology Analysis (MCOA) and illustrate its use through an MCOA-based enrichment analysis application based on a generative model of gene activation. MCOA models the classes in an ontology, the instances from an associated dataset and all directional inter-class, class-to-instance and inter-instance relationships as a single finite ergodic Markov chain. The adjusted transition probability matrix for this Markov chain enables the calculation of eigenvector values that quantify the importance of each ontology class relative to other classes and the associated data set members. On both controlled Gene Ontology (GO) data sets created with Escherichia coli, Drosophila melanogaster and Homo sapiens annotations and real gene expression data extracted from the Gene Expression Omnibus (GEO), the MCOA enrichment analysis approach provides the best performance among comparable state-of-the-art methods. Conclusion: A methodology based on Markov chain models and network analytic metrics can help detect the relevant signal within large, highly interdependent and noisy data sets and, for applications such as enrichment analysis, has been shown to generate superior performance on both real and simulated data relative to existing state-of-the-art approaches.
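
The eigenvector values mentioned in this abstract can be illustrated with the stationary distribution of a small ergodic Markov chain, computed here by power iteration. This is a generic sketch and not the MCOA implementation: the three-state transition matrix is a hypothetical toy, and the adjustment the authors apply to their matrix is not reproduced.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Power iteration for the stationary distribution of a row-stochastic matrix P.

    Assumes the chain is ergodic, so the iteration converges to a unique vector.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # One step of pi <- pi * P (row vector times matrix).
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical toy chain with three states; every row sums to 1.
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
pi = stationary_distribution(P)
```

For this toy chain the middle state receives half of the long-run probability mass, which is the sense in which such a vector ranks states by importance.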

    Alcohol use and abuse in training conscripts of the Hellenic navy

    OBJECTIVES: Alcohol abuse and addiction are major current problems of the developed world, with multivariate causality and multiple effects. Alcohol abuse in young people is a matter of central importance because of its wide-ranging, long-lasting effects, especially in Greece, where the problem has only recently started growing. The Hellenic Navy is interested in the complications of alcohol abuse in training conscripts, both because young conscripts will be placed in demanding positions and because military service in Greece is obligatory and represents an important period for the socialization of young men. METHODS: In the present study, levels of alcohol use and abuse were measured in a sample of 650 male training conscripts of the Hellenic Navy. The tools used were: (a) two questionnaires measuring frequency and quantity of alcohol consumption and psychosocial variables, and (b) the CAGE test, a questionnaire measuring hidden alcoholism. RESULTS: 38.1% of conscripts were characterized as problematic drinkers according to the criteria for adolescents. Additional psychological complications were related to alcohol use. Using the stricter criterion for adults (plus psychological complications), 8.9% were found to be problematic drinkers. The CAGE questionnaire, which measures hidden alcoholism, identified 16% of the total sample as hidden alcoholics. DISCUSSION: The findings regarding irregular levels of alcohol use and abuse are presented, as well as their relation to psychosocial complications and to demographic characteristics. The results are discussed in the light of the Greek and international bibliography.

    An Obligatory Role of Mind Bomb-1 in Notch Signaling of Mammalian Development

    Background. The Notch signaling pathway is an evolutionarily conserved intercellular signaling module essential for cell fate specification that requires endocytosis of Notch ligands. Structurally distinct E3 ubiquitin ligases, Neuralized (Neur) and Mind bomb (Mib), cooperatively regulate the endocytosis of Notch ligands in Drosophila. However, the respective roles of the mammalian E3 ubiquitin ligases, Neur1, Neur2, Mib1, and Mib2, in mammalian development are poorly understood. Methodology/Principal Findings. Through extensive use of mammalian genetics, here we show that Neur1 and Neur2 double mutants and Mib2-/- mice were viable and grossly normal. In contrast, conditional inactivation of Mib1 in various tissues revealed the representative Notch phenotypes: defects of arterial specification as in deltalike4 mutants, abnormal cerebellum and skin development as in jagged1 conditional mutants, and syndactylism as in jagged2 mutants. Conclusions/Significance. Our data provide the first evidence that Mib1 is essential for Jagged as well as Deltalike ligand-mediated Notch signaling in mammalian development, while Neur1, Neur2, and Mib2 are dispensable.

    Word add-in for ontology recognition: semantic enrichment of scientific literature

    Background: In the current era of scientific research, efficient communication of information is paramount. As such, the nature of scholarly and scientific communication is changing; cyberinfrastructure is now absolutely necessary and new media are allowing information and knowledge to be more interactive and immediate. One approach to making knowledge more accessible is the addition of machine-readable semantic data to scholarly articles. Results: The Word add-in presented here will assist authors in this effort by automatically recognizing and highlighting words or phrases that are likely information-rich, allowing authors to associate semantic data with those words or phrases, and to embed that data in the document as XML. The add-in and source code are publicly available at http://www.codeplex.com/UCSDBioLit. Conclusions: The Word add-in for ontology term recognition makes it possible for an author to add semantic data to a document as it is being written, and it encodes these data using XML tags that are effectively a standard in life sciences literature. Allowing authors to mark up their own work will help increase the amount and quality of machine-readable literature metadata.

    The Functions of Auxilin and Rab11 in Drosophila Suggest That the Fundamental Role of Ligand Endocytosis in Notch Signaling Cells Is Not Recycling

    Notch signaling requires ligand internalization by the signal-sending cells. Two endocytic proteins, epsin and auxilin, are essential for ligand internalization and signaling. Epsin promotes clathrin-coated vesicle formation, and auxilin uncoats clathrin from newly internalized vesicles. Two hypotheses have been advanced to explain the requirement for ligand endocytosis. One idea is that after ligand/receptor binding, ligand endocytosis leads to receptor activation by pulling on the receptor, which either exposes a cleavage site on the extracellular domain or dissociates two receptor subunits. Alternatively, ligand internalization prior to receptor binding, followed by trafficking through an endosomal pathway and recycling to the plasma membrane, may enable ligand activation. Activation could mean ligand modification or ligand transcytosis to a membrane environment conducive to signaling. A key piece of evidence supporting the recycling model is the requirement in signaling cells for Rab11, which encodes a GTPase critical for endosomal recycling. Here, we use Drosophila Rab11 and auxilin mutants to test the ligand recycling hypothesis. First, we find that Rab11 is dispensable for several Notch signaling events in the eye disc. Second, we find that Drosophila female germline cells, the one cell type known to signal without clathrin, also do not require auxilin to signal. Third, we find that much of the requirement for auxilin in Notch signaling was bypassed by overexpression of both clathrin heavy chain and epsin. Thus, the main role of auxilin in Notch signaling is not to produce uncoated ligand-containing vesicles, but to maintain the pool of free clathrin. Taken together, these results argue strongly that at least in some cell types, the primary function of Notch ligand endocytosis is not ligand recycling.